@jhamman
Member

@jhamman jhamman commented Oct 20, 2025

(Marking this as a draft for now)

closes: #1595
replaces: #1483
xref: zarr-developers/zarr-extensions#25

TODO:

  • Add unit tests and/or doctests in docstrings
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/user-guide/*.md
  • Changes documented as a new file in changes/
  • GitHub Actions have all passed
  • Test coverage is 100% (Codecov passes)

@github-actions github-actions bot added the `needs release notes` label Oct 20, 2025
@codecov

codecov bot commented Oct 20, 2025

Codecov Report

❌ Patch coverage is 75.31532% with 137 lines in your changes missing coverage. Please review.
✅ Project coverage is 62.18%. Comparing base (950066b) to head (6eab392).
⚠️ Report is 12 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/zarr/core/chunk_grids.py | 67.24% | 94 Missing ⚠️ |
| src/zarr/core/indexing.py | 85.07% | 20 Missing ⚠️ |
| src/zarr/testing/strategies.py | 89.15% | 9 Missing ⚠️ |
| src/zarr/core/array.py | 81.39% | 8 Missing ⚠️ |
| src/zarr/core/metadata/v3.py | 25.00% | 6 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3534      +/-   ##
==========================================
+ Coverage   61.86%   62.18%   +0.32%     
==========================================
  Files          85       85              
  Lines       10111    10604     +493     
==========================================
+ Hits         6255     6594     +339     
- Misses       3856     4010     +154     

| Files with missing lines | Coverage Δ |
|---|---|
| src/zarr/api/synchronous.py | 36.61% <ø> (ø) |
| src/zarr/core/group.py | 70.27% <ø> (+0.03%) ⬆️ |
| src/zarr/core/metadata/v3.py | 59.91% <25.00%> (+1.88%) ⬆️ |
| src/zarr/core/array.py | 68.69% <81.39%> (+0.07%) ⬆️ |
| src/zarr/testing/strategies.py | 94.11% <89.15%> (-3.70%) ⬇️ |
| src/zarr/core/indexing.py | 70.19% <85.07%> (+0.73%) ⬆️ |
| src/zarr/core/chunk_grids.py | 64.10% <67.24%> (+4.28%) ⬆️ |

... and 7 files with indirect coverage changes


@github-actions github-actions bot removed the `needs release notes` label Oct 20, 2025
Comment on lines +574 to +575
when you need to align chunks with existing data partitions.

Contributor

Suggested change
when you need to align chunks with existing data partitions.
when you need to align chunks with existing data partitions.
The specification for this chunking scheme can be found [here](https://github.com/zarr-developers/zarr-extensions/tree/main/chunk-grids/rectilinear/).

This link doesn't resolve yet but it will when the spec is merged.

Comment on lines +614 to +615
With variable chunking, the standard `.chunks` property is not available since chunks
have different sizes. Instead, access chunk information through the chunk grid:
Contributor

I think it would be better if .chunks just had a different type (tuple of tuples of ints)



@dataclass(frozen=True)
class RectilinearChunkGrid(ChunkGrid):
Contributor

Thoughts on just calling this class Rectilinear, and renaming RegularChunkGrid to Regular? We could keep a RegularChunkGrid class around for compatibility, but I feel like people know these are chunk grids when they import them.

)

@cached_property
def _cumulative_sizes(self) -> tuple[tuple[int, ...], ...]:
Contributor

nice, this is cached
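
For context on why caching matters here, the following is a minimal sketch of how cumulative chunk sizes can turn an array coordinate into a chunk index with a binary search. It is not the PR's implementation: the function names are hypothetical, and the leading `0` in each offsets tuple is an assumption (the diff only shows the return type `tuple[tuple[int, ...], ...]`).

```python
from bisect import bisect_right
from itertools import accumulate


def cumulative_sizes(chunk_sizes):
    """Per-dimension cumulative offsets, starting at 0 (assumed layout)."""
    return tuple(tuple(accumulate(dim, initial=0)) for dim in chunk_sizes)


def chunk_index_for(offsets, position):
    """Return the index of the chunk containing `position` along one dimension."""
    if not 0 <= position < offsets[-1]:
        raise IndexError(f"position {position} is outside [0, {offsets[-1]})")
    # offsets[i] is the start of chunk i, so the containing chunk is the
    # last one whose start is <= position.
    return bisect_right(offsets, position) - 1


offsets = cumulative_sizes(((10, 20, 30), (25, 25)))
print(offsets)
print(chunk_index_for(offsets[0], 10))
```

Because the offsets depend only on the (immutable) chunk sizes, computing them once and reusing them keeps each lookup at O(log n) in the number of chunks along a dimension.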

Comment on lines 584 to 588
    chunk_shapes_rle = [
        [[c, r] for c, r in zip(draw(dim_chunks), draw(repeats), strict=True)]
        for _ in range(ndim)
    ]
    return RectilinearChunkGrid(chunk_shapes=chunk_shapes_rle)
Contributor

This needs fixing
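
For readers unfamiliar with the run-length-encoded form built in the snippet above, here is a minimal sketch of how `[size, repeat]` pairs could expand into explicit chunk lengths. It assumes `c` is a chunk length and `r` a repeat count, as the names `dim_chunks` and `repeats` suggest; `expand_rle` is a hypothetical helper, and the sketch does not take a position on which fix the comment has in mind.

```python
def expand_rle(chunk_shapes_rle):
    """Expand per-dimension [size, repeat] pairs into explicit chunk lengths.

    Hypothetical helper for illustration only; not a zarr-python API.
    """
    return tuple(
        tuple(size for size, repeat in dim for _ in range(repeat))
        for dim in chunk_shapes_rle
    )


# Two dimensions: 10, 10, 5 along the first; 4, 4, 4 along the second.
rle = [[[10, 2], [5, 1]], [[4, 3]]]
expanded = expand_rle(rle)
print(expanded)
```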

@given(data=st.data())
async def test_basic_indexing(data: st.DataObject) -> None:
zarray = data.draw(simple_arrays())
@given(data=st.data(), zarray=st.one_of([simple_arrays(), complex_chunked_arrays()]))
Contributor

@dcherian dcherian Oct 27, 2025

Because the search space for the standard arrays strategy is so large, I made a separate strategy, complex_chunked_arrays, that purely exercises different chunk grids.
With simple_arrays() alone we only spend 10% of our time trying RectilinearChunkGrid, hence this approach. We should boost the number of examples too.

Comment on lines +668 to +669
2. **Not compatible with sharding**: You cannot use variable chunking together with
the sharding feature. Arrays must use either variable chunking or sharding, but not both.
Contributor

I hope this is a temporary limitation! There's a natural extension of rectilinear chunk grids to rectilinear shard grids.

Member Author

Development

Successfully merging this pull request may close these issues.

Implement Variable Chunking (ZEP003) in V3

4 participants